Abstract. The concept of a fitness landscape is a powerful metaphor that offers 
insight into various aspects of evolutionary processes and guidance for the study of 
evolution. Until recently, empirical evidence on the ruggedness of these landscapes 
was lacking, but since it became feasible to construct all possible genotypes containing 
combinations of a limited set of mutations, the number of studies has grown to a 
point where a classification of landscapes becomes possible. The aim of this review 
is to identify measures of epistasis that allow a meaningful comparison of fitness 
landscapes and then apply them to the empirical landscapes to discern factors that 
affect ruggedness. The various measures of epistasis that have been proposed in the 
literature appear to be equivalent. Our comparison shows that the ruggedness of 
the empirical landscape is affected by whether the included mutations are beneficial 
or deleterious and by whether intra- or intergenic epistasis is involved. Finally, the 
empirical landscapes are compared to landscapes generated with the Rough Mt. Fuji 
model. Despite the simplicity of this model, it captures the features of the experimental 
landscapes remarkably well. 
1. Introduction 

How genotypes map onto phenotypes is one of the central questions in biology. 
Developmental and systems biologists seek to understand the physical, biochemical 
and physiological basis of the genotype-phenotype map, while evolutionary biologists 
study its evolutionary causes and consequences [1, 2, 3]. To predict the evolutionary 
fate of a genotype it is essential to understand how genotypes map onto fitness - the 
basic predictor of an organism's evolutionary success. This has led to the notion of 
a fitness landscape [4, 5], which is a mapping from the multidimensional genotype 
space to a real-valued measure of fitness. Graphical renderings often depict the fitness 
landscape as a surface above a two-dimensional base plane symbolizing the genotype 
space, but it is clear that such a low-dimensional representation is generally inadequate 
to provide more than a rather superficial, metaphoric description of the evolutionary 
process (for an alternative visualization that does not suffer from this drawback see 
fig. 1). The limitations of the two-dimensional representation have spawned much 
fundamental criticism of the fitness landscape concept. Here, rather than abandoning 
the concept altogether, we take the view that "fitness landscapes ... should be studied in 
less picturesque but more quantitative ways" [6]. 

Within the fitness landscape metaphor, adaptation is imagined as a hill-climbing 
process leading the population to a fitness peak, with distinct roles for both natural 
selection and genetic drift [7, 8]. The structure of the fitness landscape can range from 
smooth with few accessible peaks to rugged with multiple peaks separated by valleys of 
low fitness. Whether the landscape is smooth or rugged has important consequences for 
evolution [9, 10]. For instance, the topography of the fitness landscape affects speciation 
via reproductive isolation [11, 12], the evolutionary benefits of sex and recombination 
[13, 14], evolution of genetic robustness and evolvability [15, 16, 17], and predictability 
of evolution [3, 18, 19, 20]. 

Little is known about the factors that determine the topography of a fitness 
landscape, beyond the general notion that epistasis is involved. The term epistasis 
as defined by Fisher [21] includes all deviations from the additive effects of alleles at 
different loci, and is usually considered for two alleles only. To understand the role of 
epistasis in shaping the structure of fitness landscapes we need to distinguish between 
magnitude and sign epistasis [22] . Magnitude epistasis is present when the fitness effect 
of a mutation at a given locus has a definite sign (beneficial or deleterious) irrespective 
of the alleles at other loci, while the magnitude depends on the genetic background. 
Magnitude epistasis does not constrain accessibility of mutational trajectories and only 
affects the curvature of a landscape, which can be quantified by a quadratic regression 
of mean fitness on mutation number. On the other hand, sign epistasis occurs when 
mutations are beneficial in some genetic backgrounds, but not in others. Hence, the 
sign (positive or negative) of an allelic effect changes with the presence of an allele at 
another locus, an effect that is intrinsically multidimensional [12]. Sign epistasis causes 
pathways to become inaccessible by natural selection and thus introduces ruggedness 
into the landscapes [11, 22]. A special case of sign epistasis, called reciprocal sign 
epistasis, occurs when the sign of both alleles' fitness effects changes with a change of 
alleles at the other locus. Reciprocal sign epistasis is a prerequisite for the occurrence 
of multiple fitness peaks [23, 24]. 

Figure 1. Two examples of empirical fitness landscapes containing 
all combinations of mutations at L = 4 loci. Genotypes are represented by binary 
sequences, where 0 (1) indicates the absence (presence) of the corresponding mutation. 
Arrows point in the direction of increasing fitness, and underlined nodes correspond to 
local maxima. Colored arrows point towards the fittest neighbor, forming the basins 
of attraction of a steepest ascent ("greedy") adaptive walk. (a) Fitness landscape 
based on four beneficial mutations in the bacterium Methylobacterium extorquens [25] 
(landscape A in Tables 1 and 2). (b) Fitness landscape based on four mutations in a 
malaria drug resistance gene [26]. The fourfold mutant {1111} confers maximal drug 
resistance but does not optimize the growth rate in the absence of the drug, which is 
the quantity used here as a proxy for fitness (landscape D in Tables 1 and 2). 

In theory, with full knowledge of a fitness landscape, one oversees all possible 
evolutionary pathways connecting two genotypes, and would be able to determine the 
likelihood that particular pathways are taken. This would render evolution predictable 
in a restricted (i.e., a posteriori) sense. However, these predictions are valid for a 
specific combination of genotype and environment, and depend on population dynamic 
parameters such as population size and mutation rate. Another limitation is that 
one can only study a tiny part of sequence space explicitly, because the number of 
genotypes grows beyond comprehension with the number of loci considered. Even for a 
single gene of 1000 base-pairs and when allowing only point mutations, the number of 
possible genotypes ($4^{1000}$) is larger than the total number of particles in the universe, as 
Sewall Wright [5] realized. This scale problem has two immediate implications. First, it 
emphasizes the fundamentally stochastic nature of evolution, given how little of genotype 
space has been probed by life since its origin. Second, if we are to use the growing amount 
of information about the genetic make-up of organisms to understand and ultimately 
predict evolution, we need to invoke models of fitness landscapes parametrized by 
empirical observations. 

The purpose of this review is to compare the topographies of empirical fitness 
landscapes that recently have been published. Before doing so, we briefly survey the 
main models of fitness landscapes in which ruggedness can be tuned, as well as the 
different approaches to study fitness landscapes empirically. Recent efforts have been 
directed towards constructing all $2^L$ possible genotypes containing combinations of a 
limited set of L mutations, and measuring their fitness or a proxy thereof [2]; see fig. 1 
for two examples. First, we compare different measures of ruggedness and sign epistasis 
derived from the available landscapes, and find that these correlate well and thus appear 
to be equivalent. Second, using these robust measures of ruggedness we can compare 
the ruggedness of different empirical landscapes despite methodological differences 
and the variety of biological systems involved. We find that those landscapes built 
from mutations that are known to have a combined beneficial effect are less rugged 
than those built from mutations that are selected without regard to their combined 
effect, in particular when they are deleterious. Third, we compare the empirical fitness 
landscapes to a simple statistical one-parameter model, the Rough Mt. Fuji model, 
which combines a linear fitness trend with uncorrelated random fitness variations. We 
find that this model captures the features of the empirical landscapes surprisingly well. 

2. Fitness landscape models 

Until recently empirical information about the structure of fitness landscapes was largely 
unavailable, and the number of studies is currently still small. Therefore, past studies 
of fitness landscapes have been mostly restricted to theoretical work. The models 
proposed in this context are based on very different - and sometimes even contradicting 
- intuitions. In this section, we give a brief overview of the most popular models. 
Throughout we represent genotypes by binary strings $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_L)$ of length L, 
where $\sigma_i = 0$ (1) if the mutation at the i-th locus is absent (present); compare fig. 1. 

2.1. Kauffman's LK model 

If a given gene A, when expressed, produces an essential protein that requires the 
presence of another protein produced by gene B to function properly, these two genes 
interact epistatically. When gene A is expressed independently of gene B, an organism 
with a defective mutation in gene B incurs the cost of producing the protein from gene A 
without experiencing its beneficial effect. The mutation in gene B can thus change the 
fitness contribution of gene A from beneficial to deleterious, resulting in sign epistasis. 
The main motivation of the LK model as proposed by Kauffman and Weinberger [27, 28] 
(originally named NK model) is to capture such strong, sign-epistatic effects of single 
mutations in interacting genes on fitness in a statistical sense, without attempting a 
detailed biochemical description. 

The interactions are modeled as evenly spread across the entire genome. The 
genome is represented as a binary sequence of length L and the interactions take the 
form of a set of sites called interaction partners $\nu_i = \{i, \nu_{i,1}, \nu_{i,2}, \ldots, \nu_{i,K}\}$ associated with 
each site i of the genome. The number K of interaction partners is kept constant. How 
partner sites are chosen has implications for search strategies for the global optimum on 
the landscapes [21], but the interaction partners are generally chosen by picking them 
uniformly and independently at random (making sure that no site appears twice in a 
given set $\nu_i$). The fitness of an organism with genome $\sigma$ is then the sum of the individual 
fitness contributions, 

$$f(\sigma) = \sum_{i=1}^{L} f_i(\sigma_i, \sigma_{\nu_{i,1}}, \ldots, \sigma_{\nu_{i,K}}). \qquad (1)$$

The single site contributions $f_i$ are independent and identically distributed random 
variables (i.i.d. RV's) associated with each of the $2^{K+1}$ possible states of the argument. 
If a mutation hits part of the sequence that does not appear in the argument of $f_i$, i.e. if 
the mutation does not involve the site i or any of the associated partner sites in $\nu_i$, the 
fitness contribution $f_i$ remains unchanged; otherwise it is replaced by an independent 
random number. 

For K = 0, each contribution $f_i$ can only take two possible values, corresponding 
to $\sigma_i = 0$ or 1, respectively. Thus each site has one state that is more beneficial than 
the other‡, and since the fitness of a sequence is the sum over the site contributions, 
the global optimum is at the state with all sites in their 'beneficial' position. The 
global optimum can be reached from any initial configuration by mutating sites into 
their beneficial state in a random order, which implies that all mutational pathways are 
accessible and sign epistasis is absent. Conversely, when K = L - 1, the entire sequence 
appears in the argument of each single site contribution. Thus any mutation replaces 
the sum (1) by a sum over a different set of i.i.d. RV's, which is equivalent to replacing 
one i.i.d. RV by another. The number of sites in the partner set, denoted by K, thus 
allows one to tune the strength of epistatic interactions from the non-epistatic limit 
K = 0 to the maximally epistatic case K = L - 1. 
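
The construction described in this subsection can be illustrated with a short Python sketch that tabulates one random LK landscape. The function name lk_landscape, the encoding of a genotype as an integer whose i-th bit is the state of site i, the uniform distribution of the single site contributions and the seed argument are assumptions made for this illustration; the text only requires the contributions to be i.i.d.

```python
import random


def lk_landscape(L, K, seed=0):
    """Return the fitness of all 2**L genotypes of one random LK landscape.

    Genotype g is an integer; bit i of g is the state of site i.
    """
    rng = random.Random(seed)
    # Partner set of site i: the site itself plus K distinct other sites,
    # drawn uniformly at random without repetition.
    partners = [[i] + rng.sample([j for j in range(L) if j != i], K)
                for i in range(L)]
    # One lookup table per site: an i.i.d. value for each of the 2**(K+1)
    # states of its argument (uniform on [0, 1) is an assumption).
    tables = [[rng.random() for _ in range(2 ** (K + 1))] for _ in range(L)]

    fitness = []
    for g in range(2 ** L):
        total = 0.0
        for i in range(L):
            # Pack the states of the K+1 relevant sites into a table index.
            index = 0
            for pos, site in enumerate(partners[i]):
                index |= ((g >> site) & 1) << pos
            total += tables[i][index]
        fitness.append(total)
    return fitness
```

With K = 0 every site contributes independently, so the resulting list describes a purely additive landscape; with K = L - 1 every table is indexed by the full genotype and the fitness values are mutually independent.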

2.2. Rough Mount Fuji Models 

The K = 0 limit of the LK model can be compared to a smooth (though not necessarily 
symmetric) mountain much like the Mt. Fuji volcano in Japan. This type of fitness 
landscape is therefore sometimes referred to as a 'Mt. Fuji' landscape. The other extreme 
(K = L - 1) corresponds to a maximally rugged landscape of independent fitness 
contributions and is referred to as a 'House of Cards' (HoC) landscape [30]§. Intermediate 
values of K correspond to an intermediate degree of ruggedness. An alternative way to 
obtain landscapes with intermediate ruggedness is to pick one genotype $\sigma_0$ as point of 
reference and then impose an external 'fitness field' of strength c favoring this reference 
configuration on top of random i.i.d. contributions. Then the fitness of a genotype $\sigma$ is 
given by 

$$f(\sigma) = -c\, d(\sigma_0, \sigma) + \eta(\sigma), \qquad (2)$$

where $d(\sigma_0, \sigma)$ denotes the Hamming distance between the two configurations and $\eta(\sigma)$ 
is a random variable picked independently for each genotype. This is a simplified version 
of the Rough Mount Fuji model as originally introduced in [32], which has also been 
used in [33] (see also [H]). If the i.i.d. part of the fitness fluctuates on a scale a, the 
landscape will be dominated by the external field and appear like a smooth landscape 
for c/a ≫ 1, while the random contributions will dominate for c/a ≪ 1, making the 
landscape appear maximally rugged. Note that the model assumes that the mean fitness 
profile is linear, and thereby ignores unidimensional magnitude epistasis. However, since 
our main interest is in measures of landscape ruggedness, the mean curvature of the 
landscape is not relevant. In section 3.2 we will compare measures of epistasis and 
landscape ruggedness for empirical data to those obtained for landscapes constructed 
with the RMF model (2), choosing a Gaussian distribution with standard deviation a 
for the i.i.d. random variables $\eta(\sigma)$. 

‡ Provided the fitness values are drawn from a continuous distribution, the probability of a tie is zero. 
§ Interpreting the genotype sequences as spin configurations, the HoC landscape becomes equivalent 
to Derrida's Random Energy Model of spin glasses, and the LK model is a close relative of the p-spin 
model [31]. 
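
A minimal Python sketch of the simplified RMF construction is given here for illustration. It assumes that genotypes are encoded as integers whose bits are the locus states, that the reference genotype is passed as such an integer, and that the random part is Gaussian with standard deviation sd; the names rmf_landscape, hamming, sd, reference and seed are hypothetical and chosen only for this example.

```python
import random


def hamming(a, b):
    """Hamming distance between two genotypes encoded as integers."""
    return bin(a ^ b).count("1")


def rmf_landscape(L, c, sd=1.0, reference=0, seed=0):
    """Return the fitness of all 2**L genotypes of one random RMF landscape.

    Fitness decreases linearly, with slope c, in the Hamming distance to the
    reference genotype, plus an independent Gaussian term per genotype.
    """
    rng = random.Random(seed)
    return [-c * hamming(reference, g) + rng.gauss(0.0, sd)
            for g in range(2 ** L)]
```

Setting sd to zero recovers a smooth landscape with a single peak at the reference genotype, while c = 0 gives independent random fitness values, i.e. a HoC landscape.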

2.3. Neutral Models 

An intuition different from the one behind the LK and RMF models is that the actual fitness 
matters little compared to the question whether or not a given organism is viable at 
all. The genome is composed of a large number of mutually interacting elements and a 
random mutation in any given gene can alter gene function up to the point where the gene 
no longer functions at all. It has therefore been postulated that fitness landscapes 
are dominated by large valleys of lethality and extended ridges of viability [H]. In the 
simplest setting each genotype $\sigma$ has either fitness 1 (i.e. is viable) with probability p 
or has fitness 0 (not viable) with probability 1 - p, independent of other states. The 
resulting fitness landscape is then equivalent to a realization of the site percolation 
problem [34] on an L-dimensional hypercube [35]. This type of model can be combined 
with the models described in the preceding subsections by introducing a fraction of 
non-viable genotypes in addition to the epistatically interacting viable genotypes, see 

2.4. Models with explicit phenotypes 

The models described so far intend to incorporate known aspects of the biochemical or 
biological interactions shaping the fitness of a given organism while keeping the number 
of parameters to a minimum. Another strategy is to explicitly incorporate physical, 
chemical and biological mechanisms underlying epistasis into an explicit genotype- 
phenotype map [36]. Such models have been based, for example, on the thermodynamics 
of RNA secondary structure [37] or the biophysics of binding between a transcription 
factor and its binding site [38]. While the development of such models constitutes an 
active branch of research, they are too complex and specific for the type of analyses 
that are of interest in the context of this review. 
3. Empirical studies of fitness landscapes 

Several approaches have been used to infer topographical details of real fitness 
landscapes from empirical observations; these can be roughly classified into three 
categories. Studies in the first category use the repeatability of adaptation observed 
in microbial evolution experiments to qualitatively assess local ruggedness of fitness 
landscapes. Studies in the second category focus on detecting sign epistasis between 
mutations to infer local ruggedness. The third category includes a limited, but growing 
number of studies that explicitly quantify the multidimensional fitness landscape by 
considering all $2^L$ combinations of a small set of L mutations. The topographical 
information revealed by the first two approaches is necessarily limited, but reflects 
the contribution of a large number of mutations, while the third category yields more 
detailed information, but from a tiny predefined part of genotype space. In the next 
section, we will briefly review several studies from the first two categories, and then 
present a more extensive analysis of available studies from the third category. 

3.1. Empirical support for global ruggedness 

By allowing replicate populations of microbes to evolve under identical conditions in the 
laboratory, the dynamics and repeatability of adaptation can be quantified and used to 
infer the general ruggedness of the fitness landscape involved [20, 39]. One expectation 
is that a rugged landscape leads to a stronger and more sustained divergence of fitness 
trajectories than a smooth landscape. This has been found when comparing bacteria 
evolving in a structured and a non-structured habitat [40], or in a complex relative 
to a simple nutrient environment [41]. Another expectation is that only on rugged 
landscapes the ability to adapt depends on the local mutational neighborhood of a 
genotype. In contrast, all genotypes except the globally optimal genotype are able to 
adapt on a smooth single-peaked landscape. Support for this expectation comes from a 
study with RNA bacteriophage φ6, where only one of two related genotypes was able to 
adapt under identical conditions [42], and from a study with HIV-1 where adaptation to 
one host-cell type could only be realized indirectly through adaptation to another host- 
cell environment [43]. Another prediction for rugged landscapes is that higher levels 
of adaptation diminish the ability to adapt to different niches, which was found in a 
study with biofilm-producing bacteria. Finally, the short adaptive walks found in 
recent experiments with fungi [45, 46] also suggest that their fitness landscape is rugged 
[miiH]. 

Attempts to infer topographical information from the dynamics and repeatability 
of adaptation necessarily suffer from being non-systematic. Because such studies 
reveal only those parts of the fitness landscape that have actually been probed, they 
are unable to quantify the ruggedness of the landscape. For instance, the observed 
adaptive dynamics may suggest that there are no strong epistatic constraints, while the 
population may have traveled along a rare ridge of high fitness within a rugged landscape. 
Conclusions also depend on the type of mutations used by evolution, which are specific 
for the population dynamic regime that prevailed. For instance, in large populations 
where clonal interference plays a major role, large-effect mutations will dominate [19] 
and their epistatic properties may be different from smaller-effect beneficial mutations, 
or even neutral or deleterious mutations that may contribute under different conditions 
[2]. On the other hand, these approaches may probe a more extended area of genotype 
space than the more systematic approach of mutant construction involving a predefined 
and small set of mutations, which is discussed in sect. 3.2. 

Epistasis has a clear link to ruggedness of fitness landscapes. Several studies 
confirmed the role of epistasis in causing adaptive constraints and local ruggedness by 
using isolated or constructed mutants in replay experiments to test their evolutionary 
consequences [501 HI EI]. Studies which examine pairwise interactions within sets 
of mutations often detect sign epistasis, and also provide information regarding its 
frequency [2], implying that local ruggedness is not uncommon. For example, a 
study on beneficial mutations that increase the growth rate of the ssDNA microvirid 
bacteriophage ID11 found significant evidence for sign epistasis in six out of 18 
constructed combinations [52]. A study by Sanjuan et al. [53] on vesicular stomatitis 
virus identified five out of 15 cases in which the combination of two mutations was less 
fit than either of the single mutants. Apart from studies that focus on a relatively small 
number of well-characterized mutations, the ubiquity of (pairwise) epistatic interactions 
has also been documented in recent genome-wide surveys [HEl]. Mutation combinations 
for which sign epistasis is identified point to the local ruggedness of the fitness landscape, 
but do not reveal the global structure of the landscape [12]. 

3.2. Explicit low-dimensional fitness landscapes 

The existence of sign epistatic interactions between mutations reveals that landscape 
topography can be rugged, but a more systematic approach is required to quantify 
the degree of ruggedness of fitness landscapes, and to determine how this constrains 
evolution. Given the large number of publications on fitness landscapes, the number 
of studies on empirical fitness landscapes is remarkably small. Full information is 
available when the fitness of all $2^L$ combinations of a set of L mutations is known. 
At present, available empirical landscapes stem from a variety of systems and involve 
small numbers (i.e., 4-9) of mutations (see table 1). In reality, adaptation proceeds by 
selection on all possible mutations in the genome and is not necessarily limited to such 
a small subset. These landscapes thus only offer a glimpse of the ruggedness within 
the immense genotype space. Given this limitation, we cannot compare ruggedness 
between different empirical fitness landscapes without a clear view on which mutations 
are involved and which part of genotype space is being mapped. As it turns out - despite 
their low number - the available landscapes are rather different in several aspects, 
and include mutations in single genes versus whole genomes, with fitness effects that in 
some cases are known a priori to be beneficial or deleterious (for individual mutations 
or for the combination studied) and in other cases emerge only a posteriori. For each of 
the empirical landscape studies included in our analyses (and a few which we did not 
include), table 1 summarizes the main characteristics. 

Table 1. General characteristics of the empirical fitness landscapes included in this 
review. The table lists the number of loci involved, the number of available genotypes, 
the fitness (proxy) that is measured for each combination, and the type of mutations 
included in the landscapes. The organism is indicated (in italics) when landscapes 
are based on genome-wide mutations or the gene name is provided (in upright letters) 
when mutations are located on a single gene. Columns 6 and 7 indicate whether the 
included mutations were known or expected to be beneficial or deleterious, individually 
and/or in combination. The results of our quantitative analyses for the landscapes A-J 
are given in table 2. 

| ID  | System (organism / gene)           | L   | Available combinations | Fitness (proxy)         | Direction of mutations | Known effects        | Ref. |
|-----|------------------------------------|-----|------------------------|-------------------------|------------------------|----------------------|------|
| A   | Methylobacterium extorquens        | 4   | 16/16                  | Growth rate             | Beneficial             | Combined             | [25] |
| B   | Escherichia coli                   | 5   | 32/32                  | Fitness                 | Beneficial             | Combined             | [55] |
| C-D | Dihydrofolate reductase            | 4   | 16/16                  | Resistance/Growth rate  | Beneficial             | Individual/Combined^a | [26] |
| E   | β-lactamase                        | 5   | 32/32                  | Resistance              | Beneficial             | Combined^b           | [56] |
| F   | β-lactamase                        | 5   | 32/32                  | Resistance              | Beneficial^c           | Combined^b           | [57] |
| G   | Saccharomyces cerevisiae           | 6   | 64/64                  | Growth rate             | Deleterious            | Individual           | [58] |
| H   | Aspergillus niger                  | 8   | 186/256^d              | Growth rate             | Deleterious            | Individual           | [33] |
| I-J | Terpene synthase                   | 9   | 418/512^d              | Enzymatic specificity^e |                        |                      | [59] |
|     | Dihydrofolate reductase            | 5^f | 29/48^d                | Resistance/Growth rate  | Beneficial             | Individual/Combined  | [60] |
|     | Dihydrofolate reductase            | 5^f | 29/48^d                | Resistance/Growth rate  | Beneficial             | Individual/Combined  | [61] |
|     | HIV-1 envelope glycoprotein gp120  | 7   | 56/128^d               | Infectivity             | Beneficial             | Individual/Combined  | [62] |
|     | Isopropylmalate dehydrogenase      | 6^f | 164/512^d              | Performance/Fitness     |                        |                      | [63] |

^a The mutants were chosen to maximize drug resistance but do not optimize the growth rate in the absence of 
the drug. 

^b The highly resistant genotype resulted from gene-shuffling, which implies that an accessible pathway between 
the wildtype and this mutant did not necessarily exist. 

^c The same mutants as in [56] were studied with respect to piperacillin+inhibitor resistance. Due to the strong 
anticorrelation between cefotaxime and piperacillin+inhibitor resistance the wildtype was expected to be 
exceptionally fit. 

^d The remaining combinations were missing, either by chance or because the corresponding phenotypes are 
not viable. The studies [60-63] were excluded from further analysis because of the large number of missing 
combinations. 

^e The study considers mutational pathways connecting two terpene synthases, TEAS and HPS. Enzymatic 
specificity is the relative proportion of the natural product of TEAS (landscape I) and HPS (landscape J) 
among the total catalytic output of the mutated enzymes. 

^f More than one mutation was included for at least one locus, hence the number of possible combinations is larger 
than $2^L$. 

3.2.1. Quantitative measures of landscape ruggedness and epistasis. Various statistical 
measures have been proposed to quantify the ruggedness of fitness landscapes. Most 
studies focused on different aspects of landscape topography and a variety of measures 
has consequently been applied. Here, we aim to analyze all landscapes using a common 
and standardized selection of measures. This enables a comparison between landscapes, 
but also allows us to verify whether model landscapes actually capture the topography of 
the empirical landscapes. Furthermore, we explore the correlations between the different 
measures of ruggedness to see whether they are equivalent or yield complementary 
information about the topography of real fitness landscapes. In total, we use six 
measures for our analyses: 

(1) The roughness to slope ratio, r/s, was introduced in [64] and used in [6, 18]. This 
ratio measures how well the landscape can be described by a linear model, which 
corresponds to the purely additive (non-epistatic) limit. It is obtained by fitting 
a multidimensional linear model to the empirical fitness landscape by means of a 
least-square fit. The linear ansatz is 

$$f_{\mathrm{lin}}(\sigma) = a^{(0)} + \sum_{j=1}^{L} a^{(1)}_j \sigma_j, \qquad (3)$$

where the parameters $a^{(0)}$ and the $a^{(1)}_j$'s are fitted. The mean slope is 

$$s = \frac{1}{L} \sum_{j=1}^{L} \left| a^{(1)}_j \right| \qquad (4)$$

and the roughness is defined by 

$$r = \sqrt{2^{-L} \sum_{\sigma} \left( f(\sigma) - f_{\mathrm{lin}}(\sigma) \right)^2}. \qquad (5)$$

The higher r/s, the higher the deviation from the linear model and the more 
epistasis is present in the landscape. For example, a purely additive landscape 
has r/s = 0, while for the HoC model $r/s \to \infty$ for $L \to \infty$. 
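
The roughness and slope of measure (1) can be sketched in Python as follows. The sketch assumes a complete landscape stored as a list f of length 2**L, indexed by genotypes encoded as integers whose bits are the locus states, and a linear model written in the 0/1 locus variables. For such a complete hypercube the least-squares coefficient of locus j equals the difference between the mean fitness of all genotypes carrying the mutation and of all genotypes lacking it, which avoids a general regression routine. The function name roughness_slope is hypothetical.

```python
from math import sqrt


def roughness_slope(f, L):
    """Return (r, s): roughness and mean slope of a complete landscape.

    f[g] is the fitness of genotype g, whose bit j is the state of locus j.
    """
    n = 2 ** L
    mean = sum(f) / n
    # Least-squares slope at locus j for a full factorial design:
    # mean fitness with the mutation minus mean fitness without it.
    a1 = []
    for j in range(L):
        with_j = sum(f[g] for g in range(n) if (g >> j) & 1) / (n / 2)
        without_j = sum(f[g] for g in range(n) if not (g >> j) & 1) / (n / 2)
        a1.append(with_j - without_j)
    # Intercept chosen so that the fit reproduces the mean fitness.
    a0 = mean - sum(a1) / 2
    s = sum(abs(a) for a in a1) / L
    residuals = 0.0
    for g in range(n):
        fit = a0 + sum(a1[j] for j in range(L) if (g >> j) & 1)
        residuals += (f[g] - fit) ** 2
    r = sqrt(residuals / n)
    return r, s
```

For the two-locus example f = [0, 1, 1, 3], in which the double mutant is fitter than additivity predicts, the sketch gives r = 0.25 and s = 1.5. If all fitted slopes vanish, s is zero and the ratio r/s is undefined.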

(2) A versatile set of measures is provided by the Fourier analysis of fitness landscapes 
introduced in [65]. Here, the fitness landscape is expanded in terms of the 
eigenvectors of the Laplacian on the underlying genotype network (in our case, 
the L-dimensional hypercube). The Laplacian is defined as $\Delta = A - L\,\mathbf{1}$, where 
A is the adjacency matrix and $\mathbf{1}$ is the unit matrix of dimension $2^L \times 2^L$. Note 
that $-\Delta$ has $2^L$ non-negative eigenvalues, and thus eigenstates, but that the n-th 
distinct eigenvalue, $2n$ with $n = 0, \ldots, L$, comes with a multiplicity given by the binomial coefficient $\binom{L}{n}$, 
such that the eigenvalues take only L + 1 different values. The expansion of the 
fitness landscape into the eigenvectors $\Lambda^{(n)}_j$ of $\Delta$ is equivalent to an expansion in 
terms corresponding to epistatic interactions of different orders, i.e. 

$$f(\sigma) = \sum_{n=0}^{L} \sum_{j=1}^{\binom{L}{n}} b^{(n)}_j \Lambda^{(n)}_j(\sigma)$$
$$\phantom{f(\sigma)} = a^{(0)} + \sum_{j=1}^{L} a^{(1)}_j s_j + \sum_{\substack{j,k=1 \\ j>k}}^{L} a^{(2)}_{jk} s_j s_k + \cdots + a^{(L)} s_1 s_2 \cdots s_L, \qquad (6)$$

where the b's and the a's are the coefficients of the expansion, and we have 
introduced symmetric 'spin' variables $s_j = 2\sigma_j - 1 = \pm 1$ for convenience [66]. 
Note that the n-th term in the upper expression is equal to the n-th term in the 
lower one. The term $b^{(0)} \Lambda^{(0)} = a^{(0)}$ is a constant which yields no information 
about epistasis. The second term sums the contributions in the directions of the 
eigenvectors corresponding to the second smallest eigenvalue of $-\Delta$ and describes the 
additive, non-epistatic, part of the fitness landscape. The remaining terms describe 
epistatic interactions of increasing order. Defining 

$$F_n = \frac{\beta_n}{\sum_{j=1}^{L} \beta_j} \quad \text{with} \quad \beta_n = \sum_{j=1}^{\binom{L}{n}} \left( b^{(n)}_j \right)^2, \quad n = 1, \ldots, L, \qquad (7)$$

we obtain measures for the contributions to the fitness landscape for both 
non-epistatic and epistatic interactions. The $F_n$ are normalized to add up to 1, 
$\sum_{n=1}^{L} F_n = 1$. If one is only interested in the total contribution of epistasis one 
should take the sum over all terms corresponding to the interaction part, yielding 
the epistasis measure 

$$F_{\mathrm{sum}} = \sum_{j=2}^{L} F_j. \qquad (8)$$

For a purely additive landscape $F_1 = 1$, and thus $F_{\mathrm{sum}} = 0$. For a completely 
random (HoC) landscape $F_1 \to 0$ for $L \to \infty$. When interested in the contributions 
of second, third or higher-order interactions one can analyze the terms $F_2$, $F_3$, etc. 
separately. Note that the Fourier analysis described here is equivalent to an analysis 
of variance (ANOVA) commonly used by biologists, which was employed in [67] to 
estimate the contribution of main effects ($F_1$) and all possible interactions ($F_2$, $F_3$, 
$F_4$ and $F_5$, summing up to $F_{\mathrm{sum}}$) among two sets of five mutations in the fungus 
Aspergillus niger. 
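
The normalized spectrum of measure (2) can be computed, for a complete landscape, with a fast Walsh-Hadamard transform, since on the hypercube the Laplacian eigenvectors are products of the spin variables. The Python sketch below is one possible implementation under that assumption, not the procedure of the cited studies. It assumes genotypes encoded as integers whose bits are the locus states; the name fourier_spectrum is hypothetical. The transform yields the expansion coefficients only up to sign and a common factor, which cancels in the normalized weights.

```python
def fourier_spectrum(f, L):
    """Return [F_1, ..., F_L], the normalized weight of each interaction order.

    f[g] is the fitness of genotype g, whose bit j is the state of locus j.
    """
    n = 2 ** L
    coeff = list(f)
    # In-place fast Walsh-Hadamard transform. Afterwards coeff[m] equals
    # sum_g f[g] * (-1)**popcount(g & m), which is proportional (up to sign)
    # to the coefficient of the product of the spins selected by the bits of m.
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for k in range(start, start + h):
                x, y = coeff[k], coeff[k + h]
                coeff[k], coeff[k + h] = x + y, x - y
        h *= 2
    # beta[order] collects the squared coefficients of all terms that involve
    # exactly `order` loci; order 0 is the constant term and is discarded.
    beta = [0.0] * (L + 1)
    for m in range(n):
        beta[bin(m).count("1")] += coeff[m] ** 2
    total = sum(beta[1:])
    return [b / total for b in beta[1:]]
```

The total epistasis measure then follows as sum(fourier_spectrum(f, L)[1:]). For the two-locus example f = [0, 1, 1, 3] the sketch gives F_1 = 18/19 and F_2 = 1/19. A completely flat landscape has no non-constant component, so the normalization fails there and must be handled separately.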

(3) A frequently used measure of landscape ruggedness is the number of local fitness 
maxima $N_{\max}$, which exceeds unity only in the presence of reciprocal sign epistasis 
[21]. For the HoC model it is easy to see that $N_{\max} = 2^L/(L+1)$ on average [68], [28], 
while the maximal possible value for any (binary) fitness landscape is $N_{\max} = 2^{L-1}$ 
[U ES]. Asymptotic expressions for the mean number of local maxima have been 
derived for the LK model [70, 71, 72] as well as for the RMF model [73]. Note 
that, like all quantifiers that only depend on the ordering of fitness values, $N_{\max}$ is 
insensitive to magnitude epistasis. 

(4) While the quantities introduced so far are global measures of ruggedness, it is also of
interest to characterize epistatic interactions locally. As a convenient local measure
of epistasis we examined all pairs of genotypes in the landscape with a Hamming
distance of 2, and counted the fraction f_s of local motifs showing 'simple' sign
epistasis (i.e., the sign of the effect of a mutation at locus i depends on the state of
locus j but not vice versa), and the fraction showing reciprocal sign epistasis, f_r [23]. For
a HoC landscape, the expected values for these quantities are f_s = f_r = 1/3, while
for a purely additive landscape both vanish.

(5) Several recent studies apply measures of epistasis which are based on the notion 
of evolutionarily accessible pathways, which are connected paths of single step 
mutations along which fitness increases monotonically [22l EHl ESI ESI CH]- Of 
particular interest are the direct (shortest) paths to the global fitness maximum, 
since they provide a clear signature for the presence of sign epistasis: in a landscape 
without sign epistasis, all paths from an arbitrary genotype to the global maximum 
are accessible, while at least some of these paths become inaccessible in the presence 
of sign epistasis [22]. Following [56, 33] we count the number of crossing accessible
paths N_cp that lead to the fittest genotype starting from the reversal (antipodal)
genotype at distance L. For purely additive landscapes N_cp = L!, while N_cp = 1 on
average for a landscape compatible with the HoC model, independent of the size L
of the landscape [33].

(6) Besides the number of crossing paths N_cp introduced above, there is a large group of
estimators of ruggedness and epistasis that rely on counting the number or length of 
accessible paths, e.g., the length and number of paths with a monotonic increase in 
fitness from the genotype with the lowest fitness to the global optimum [71]. Some 
measures allow for detours while others do not. Measures that include neutral 
or double mutations into paths have also been applied [75]. Other definitions do 
not take the location of the starting or endpoints into account, but ask for the 
length of the longest path that always leads from a state to its fittest neighbor 
(greedy walks) [171 EHl [76] or that only admits states with exactly one fitter 
neighbor along the path [77]. While these path measures can yield interesting 
information about evolutionary dynamics, they are often less suitable to quantify 
epistasis because they correlate non-monotonically with conventional measures of 
landscape ruggedness. As an example, we consider the ratio f_mm of the
number of accessible paths from the least fit to the fittest state of the landscape
to the number of such paths accessible on a purely additive landscape,
allowing for arbitrary detours. In fig. 2 we plot f_mm obtained from simulations
of the RMF model (sect. 2.2) for a range of values of the slope c, while fixing
the fluctuation parameter a = 0.1. Recall that the increase of c from 0 to ∞
corresponds to the transformation from a completely rugged to a perfectly additive
landscape. Thus, the amount of epistasis in the landscape decreases monotonically
with increasing c, while f_mm shows a pronounced maximum at an intermediate
value of c, i.e., the dependence of f_mm on the amount of epistasis is non-monotonic.
We nevertheless include f_mm in our analyses to emphasize that epistasis does not
only imply adaptive constraints, and may sometimes even promote evolvability by
allowing detours. Such detours are not accessible in purely additive landscapes,
and may lead to f_mm > 1 (see also [71, 75]). In contrast, the other path-dependent
quantity N_cp does have a monotonic dependence on epistasis parameters like s in
the RMF model, or K in the LK model [33].

Figure 2. The number of fitness monotonic (evolutionarily accessible) paths from the
global fitness minimum to the global fitness maximum, divided by the corresponding
number in a non-epistatic landscape, f_mm, is plotted vs. the slope c of the RMF model
landscape (2). Measurements were carried out on landscapes of size L = 4 and
averaged over 10,000 realizations of the landscape for each choice of c. Note that f_mm
depends non-monotonically on c.
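The ordinal measures in items (3) to (5) can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analyses: it assumes a complete landscape stored as a dictionary from 0/1 tuples to fitness values, and the function names and the two toy landscapes (`additive`, `valley`) are hypothetical.

```python
from itertools import permutations, product


def flip(g, i):
    """Return genotype g (a tuple of 0/1) with locus i mutated."""
    return g[:i] + (1 - g[i],) + g[i + 1:]


def count_maxima(f):
    """N_max: genotypes that are fitter than all of their single-mutant neighbors."""
    return sum(all(f[g] > f[flip(g, i)] for i in range(len(g))) for g in f)


def sign_epistasis_fractions(f):
    """Return (f_s, f_r), the fractions of two-locus squares with simple and
    reciprocal sign epistasis."""
    L = len(next(iter(f)))
    simple = reciprocal = squares = 0
    for g in f:
        for i in range(L):
            for j in range(i + 1, L):
                if g[i] or g[j]:
                    continue  # visit each square once, from its (0, 0) corner
                gi, gj = flip(g, i), flip(g, j)
                gij = flip(gi, j)
                # Does the effect of one mutation change sign with the state of the other?
                i_flips = (f[gi] - f[g]) * (f[gij] - f[gj]) < 0
                j_flips = (f[gj] - f[g]) * (f[gij] - f[gi]) < 0
                squares += 1
                simple += i_flips != j_flips
                reciprocal += i_flips and j_flips
    return simple / squares, reciprocal / squares


def crossing_paths(f):
    """N_cp: direct fitness-increasing paths from the antipode of the global
    maximum to the global maximum."""
    top = max(f, key=f.get)
    start = tuple(1 - x for x in top)
    count = 0
    for order in permutations(range(len(top))):
        g = start
        for i in order:
            h = flip(g, i)
            if f[h] <= f[g]:
                break  # fitness does not increase: path is inaccessible
            g = h
        else:
            count += 1
    return count


# Hypothetical toy landscapes (not data from the studies discussed here).
additive = {g: sum(w * x for w, x in zip((1, 2, 3, 4), g))
            for g in product((0, 1), repeat=4)}
valley = {(0, 0): 1.0, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 2.0}

for name, f in (("additive", additive), ("valley", valley)):
    print(name, count_maxima(f), sign_epistasis_fractions(f), crossing_paths(f))
```

On the additive toy landscape the sketch gives N_max = 1, f_s = f_r = 0 and N_cp = 4! = 24, in line with the reference values quoted above. The two-locus `valley` shows reciprocal sign epistasis, which produces a second maximum and blocks both direct paths.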

3.2.2. Standardizing the data sets. Before presenting the data, one should note that the 
expected values of the above quantities, for a given amount of epistasis, may depend on 
the size of the underlying landscape. In general, we lack analytical predictions on how 
landscape size affects our measures, and we therefore restrict the analysis to subgraphs 
of the same size. Subgraph analysis of fitness landscapes was introduced in [33] as a 
means to probe the effect of the mutational distance scale within a fitness landscape. 
Here a subgraph of size m is the hypercube spanned by all 2^m combinations of m out of
L mutations. For landscapes of size L > 4 we calculated the topographic measures for 
all subgraphs of size m = 4 that contained at least eight viable and known states and 
averaged the values over the subgraphs. For the landscapes of size L = 4 (A, C, and D) 
the calculated values refer to the complete landscape. 
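The subgraph averaging can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of [33]: the loci that are not part of a subgraph are assumed to stay in state 0, the landscape is assumed to be complete (the filter on viable and known states is omitted), and the names `subgraphs`, `subgraph_average` and `toy` are hypothetical.

```python
from itertools import combinations, product


def subgraphs(f, L, m=4):
    """Yield every m-locus sub-landscape of f, re-indexed on m loci."""
    for loci in combinations(range(L), m):
        sub = {}
        for bits in product((0, 1), repeat=m):
            g = [0] * L  # assumption: loci outside the subgraph stay in state 0
            for locus, b in zip(loci, bits):
                g[locus] = b
            sub[bits] = f[tuple(g)]
        yield sub


def count_maxima(f):
    """Number of genotypes fitter than all of their single-mutant neighbors."""
    def flip(g, i):
        return g[:i] + (1 - g[i],) + g[i + 1:]
    return sum(all(f[g] > f[flip(g, i)] for i in range(len(g))) for g in f)


def subgraph_average(f, L, measure, m=4):
    """Average a topographic measure over all subgraphs of size m."""
    values = [measure(sub) for sub in subgraphs(f, L, m)]
    return sum(values) / len(values)


# Hypothetical additive landscape on L = 6 loci: C(6, 4) = 15 subgraphs.
L = 6
toy = {g: sum((k + 1) * x for k, x in enumerate(g))
       for g in product((0, 1), repeat=L)}
print(subgraph_average(toy, L, count_maxima))
```

Any measure that takes a landscape dictionary can be passed as `measure`, so the same loop serves for the other quantities as well.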

Furthermore, one should keep in mind that the fit to the multi-dimensional linear
model and the Fourier analysis presume that mutations interact additively in the absence
of epistasis. How the effects of mutations add up in the interaction-free case will,
Figure 3. The measured fitness values, f_i, for the β-lactamase resistance landscape
E [56] are plotted vs. the fitness values obtained from the fit with the model described
by equation (3). For a perfect fit, all dots would lie on the straight line with slope
1. The upper panel uses the measurements taken directly from [56]. A systematic
deviation from the straight line is observed. The lower panel uses the logarithms of
the measurements; no systematic deviations are observed.



however, depend on the quantity one measures as a proxy of fitness. For example,
when the linear model is fitted to the landscape E of [56], which is based on measures
of the minimal inhibitory concentration (MIC) of an antibiotic, systematic deviations
from the measured landscape result. A much better fit was obtained by considering
the logarithms of the same MIC values, implying that, in this case, the interaction-free
landscape is closer to a multiplicative than to an additive model (see fig. 3). Since there
is no general theory predicting how mutational effects should combine for the different
proxies of fitness, we consistently applied the logarithmic transformation. For all fitness
measurements based on concentrations of drugs or toxins that limit growth or survival,
like MIC values, the logarithms much improve the fit to the linear model. In the other
cases, the logarithms at least did not worsen the fit. Note that the MIC values for
the combination of piperacillin and an inhibitor listed in [57] are already the logarithms
of the measurements.
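The effect of the logarithmic transformation on the Fourier measures can be illustrated with a small sketch. It assumes that the coefficients are taken in the Walsh basis of the hypercube and that the weight of order n is the sum of the squared coefficients of that order; the fold-changes in `folds` are hypothetical and do not come from any of the data sets.

```python
import math
from itertools import product


def fourier_spectrum(f, L):
    """Return [F_1, ..., F_L] for a complete landscape f on L binary loci.

    Assumption: the weight of order n is the sum of the squared Walsh
    coefficients involving exactly n loci, normalized over n = 1..L.
    """
    genotypes = list(product((0, 1), repeat=L))
    beta = [0.0] * (L + 1)
    for subset in genotypes:  # subset[i] == 1 marks locus i as part of the interaction
        coeff = sum(f[g] * (-1) ** sum(s * x for s, x in zip(subset, g))
                    for g in genotypes) / len(genotypes)
        beta[sum(subset)] += coeff ** 2
    total = sum(beta[1:])
    return [b / total for b in beta[1:]]


# Hypothetical multiplicative landscape: each mutation multiplies the MIC.
folds = (2.0, 4.0, 8.0, 16.0)
L = len(folds)
mic = {g: math.prod(k for k, x in zip(folds, g) if x)
       for g in product((0, 1), repeat=L)}
log_mic = {g: math.log(v) for g, v in mic.items()}

raw = fourier_spectrum(mic, L)
logged = fourier_spectrum(log_mic, L)
print("F_1 on raw values:", raw[0])
print("F_1 on log values:", logged[0])
```

On the raw values a sizeable share of the weight sits in the interaction terms even though the mutations combine without any interaction on the multiplicative scale. After the transformation all of the weight is in F_1, up to rounding error.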

In the datasets H-J, fitness proxies are missing for several genotypes (tabled]). The 
cause is either non-viability of those genotypes (dataset H) or unobserved genotypes
(datasets I and J). In the latter case, we assume that the unobserved genotypes were
missed by chance [59]. We therefore replaced the missing measurements by values
obtained from the fitted multidimensional linear model and subsequently performed
the logarithmic transformation. In the case of non-viability this poses a problem for
the log transformation of the fitness measurements. Dataset H was shown to contain
non-viable A. niger genotypes on the basis of a statistical analysis [33]. A non-viable
genotype would imply a logarithmic fitness equal to minus infinity. To circumvent
this issue, we did not perform the log transformation for this dataset. Because the
fitness values of viable genotypes (expressed in terms of relative growth rates) were
Table 2. Quantitative measures of ruggedness and epistasis for 10 empirical fitness
landscapes. Except for the landscapes of size L = 4, the reported values are averages
over all possible 4-locus subgraphs. The last two lines show reference values obtained
from simulations of the house-of-cards model with L = 4 (HoC), and for a perfectly
additive landscape (PA).



ID   Ref.       L   r/s     F_1     F_2     F_sum   N_max   N_cp    f_r     f_s     f_mm
A    [25]       4   0.122   0.989   0.009   0.011   1       24      0       0       1
B    [55]       5   0.290   0.942   0.040   0.058   1.10    16.80   0.013   0.150   1.92
C    [26] (a)   4   0.517   0.267   0.400   0.733   2       16      0.083   0.250   0.67
D    [26] (b)   4   0.986   0.537   0.197   0.463   2       10      0.125   0.458   0.67
E    [56]       5   0.418   0.894   0.064   0.106   1.50    6.88    0.025   0.150   1.09
F    [57] (c)   5   0.380   0.921   0.061   0.079   1.30    8.75    0.050   0.250   3.03
G    [58]       6   1.180   0.658   0.179   0.342   2.13    3.51    0.229   0.358   3.16
H    [33]       8   1.304   0.547   0.269   0.453   2.61    3.80    0.154   0.262   2.19
I    [59] (d)   9   1.317   0.376   0.368   0.624   2.66    3.80    0.240   0.292   1.71
J    [59] (e)   9   1.199   0.383   0.372   0.617   2.48    4.21    0.227   0.300   1.92
•    HoC        4   2.423   0.267   0.402   0.733   3.20    2.85    0.333   0.333   2.20
□    PA         4   0       1       0       0       1       24      0       0       1

(a) Pyrimethamine resistance measurements.
(b) Growth rate measurements.
(c) Data for piperacillin resistance in the presence of a β-lactamase inhibitor; these mutations
    were originally selected for their beneficial effect on cefotaxime resistance.
(d) Relative 5-epi-Aristolochene output (main product of TEAS terpene synthase).
(e) Relative Premnaspirodiene output (main product of HPS terpene synthase).

fairly close to unity, taking the logarithm of the fitness values of viable genotypes would 
not substantially alter the results. 

The results of the standardized analyses are presented in table 2. In short, we
observe that landscapes obtained by combining genome-wide mutations with a known
collectively beneficial effect are smoother. In fact, the landscapes A and B share
these two characteristics and have the lowest ruggedness for all four measures (see fig.
4). The landscapes obtained by combining mutations with a known beneficial effect from
a single gene (data sets C-F) are more rugged, while the highest degree of ruggedness is
measured in landscapes constructed from genome-wide mutations with deleterious (data
sets G and H) or unknown (data sets I and J) effects. Before turning to the biological
implications of these trends in sect. 4, we find it useful to establish the correlation
between the different measures (sect. 3.2.3) and the fit to the RMF model (sect. 3.2.4).

3.2.3. Correlation between different measures of landscape ruggedness. To investigate
how well the different measures (except f_mm) correlate with one another, we first rank all
landscapes for each measure separately; i.e., if a landscape has the n-th lowest value
for a quantity, it is assigned rank n with respect to that quantity. In fig. 4 we make
pairwise plots of these ranks for the different measures. For a perfect rank correlation
between the measures, the symbols should lie on a straight line. In general, the different
measures of ruggedness correlate well, suggesting that these quantities all reflect the
relative contribution of epistasis in a similar way‖. The number of maxima N_max even
has a perfect correlation with the roughness to slope ratio r/s. The number of crossing
paths, N_cp, correlates somewhat less well with the other quantities. We will examine
this deviation when we compare the measured values with expectations from model
landscapes.

Figure 4. The ranks of the fitness landscapes specified in table 2 with respect to the
studied quantifiers of epistasis are plotted against each other. In general, the quantities
seem rather well correlated (see main text for a detailed discussion).
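The rank comparison can be reproduced from the values listed in table 2 with a short sketch. Tied values receive the average of their ranks here, which is an assumption about tie handling, and the helper names are hypothetical. Since a high N_cp indicates a smooth landscape, its rank correlation with r/s is expected to be negative.

```python
import math


def average_ranks(values):
    """Rank values from lowest (1) to highest; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in order[pos:end + 1]:
            ranks[k] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Values for landscapes A to J as listed in table 2.
r_over_s = [0.122, 0.290, 0.517, 0.986, 0.418, 0.380, 1.180, 1.304, 1.317, 1.199]
n_max = [1, 1.10, 2, 2, 1.50, 1.30, 2.13, 2.61, 2.66, 2.48]
n_cp = [24, 16.80, 16, 10, 6.88, 8.75, 3.51, 3.80, 3.80, 4.21]

print("r/s vs N_max:", spearman(r_over_s, n_max))
print("r/s vs N_cp: ", spearman(r_over_s, n_cp))
```

With average ranks the only departure from a perfect correlation between r/s and N_max comes from the tie between landscapes C and D, which both have N_max = 2.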

It is also instructive to compare data sets that measure different quantities using
the same set of genotypes. Landscapes C and D are based on measurements of drug
resistance and growth rate, respectively. The mutations in this set of genotypes were
selected for their beneficial effects on resistance. Increased resistance is expected to entail
a trade-off in the absence of the drug and, since growth rates were determined under
these conditions, the included mutations are no longer beneficial. Given the tendency
that landscapes from beneficial mutations are smoother, it is not surprising that the
growth rate landscape D is more rugged than the resistance landscape C. Landscapes E
and F are based on a genotype with multiple mutations that was selected for its
increased resistance to a particular β-lactam antibiotic, cefotaxime. The fitness proxy
used to construct landscape E is cefotaxime resistance, whereas in landscape F it is
resistance to another antibiotic, piperacillin, which was not involved in the selection
of the genotypes. However, the reason they were measured in the piperacillin
environment (with inhibitor) is that resistance in this environment showed an overall
trade-off with cefotaxime resistance. Hence, the included (reverse) mutations were
collectively beneficial for this environment. In contrast to landscapes C and D, the
landscapes E and F turn out to be almost equally rugged.

‖ This conclusion differs from a related analysis in [18], where little or no correlation between different
roughness measures was found for a family of landscapes based on protein folding.

Figure 5. The measured values for the studied quantifiers of epistasis are plotted
with respect to each other for the fitness landscapes specified in table 2. The black line
corresponds to values obtained numerically for RMF landscapes interpolating between
a non-epistatic (open square) and a completely random (filled circle) landscape.
3.2.4. Combining models and empirical data. To compare the measurements for
the empirical landscapes with expectations calculated for model landscapes, we use
the predictions generated by the RMF model (2). We fixed the parameter
controlling the roughness of the landscape to a = 0.01, and calculated all measures
for various choices of the slope c. For each pair of a and c, the calculated values were
averaged over 10,000 realizations of the 4-locus landscape. Recall that the case c = 0
corresponds to a completely random landscape, i.e. to the House-of-Cards model. The
opposite extreme of a purely additive landscape was also considered by setting a = 0
for an arbitrary value of c ≠ 0.
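The simulation procedure can be sketched as follows for one of the measures, the number of maxima. The exact form of the model in equation (2) is not restated in this section, so the sketch assumes the common form in which fitness decreases linearly with the Hamming distance to a reference genotype with slope c, plus an independent Gaussian term of standard deviation a. The Gaussian choice, the all-ones reference genotype and the function names are assumptions.

```python
import random
from itertools import product


def rmf_landscape(L, c, a, rng):
    """One realization: -c * (distance to the all-ones genotype) + Gaussian noise."""
    return {g: -c * (L - sum(g)) + rng.gauss(0.0, a)
            for g in product((0, 1), repeat=L)}


def count_maxima(f):
    """Number of genotypes fitter than all of their single-mutant neighbors."""
    def flip(g, i):
        return g[:i] + (1 - g[i],) + g[i + 1:]
    return sum(all(f[g] > f[flip(g, i)] for i in range(len(g))) for g in f)


def mean_maxima(L, c, a, realizations, seed=0):
    """Average N_max over independent realizations of the landscape."""
    rng = random.Random(seed)
    total = sum(count_maxima(rmf_landscape(L, c, a, rng))
                for _ in range(realizations))
    return total / realizations


# c = 0 is the House-of-Cards limit; a = 0 with c != 0 is purely additive.
print("HoC limit:     ", mean_maxima(4, 0.0, 0.01, 2000))
print("additive limit:", mean_maxima(4, 1.0, 0.0, 10))
```

In the c = 0 limit the average should come out close to the House-of-Cards value 2^L/(L+1) = 3.2 for L = 4, and in the a = 0 limit every realization has a single maximum. Intermediate values of c interpolate between the two.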

In fig. 5 we plot all pairwise combinations of the four epistasis measures previously
included in the rank analysis of fig. 4. The black line corresponds to the range of possible
outcomes for the model landscapes, one limit corresponding to the House-of-Cards case
(marked by a filled circle) and the other limit being the purely additive case (marked
by an empty square). The letters represent the measurements from the experimental
landscapes (see table 2). The close correspondence between the letters and the line
indicates that the RMF model captures the different ruggedness measures and their
correlations observed for the experimental landscapes surprisingly well. The relatively
large deviations for landscapes C and D from [26] are most likely due to the fact that
these are 4-locus landscapes and the measurements are thus based on a single observation,
rather than being an average over multiple subgraphs of size m = 4 as is the case for
the larger landscapes. We also note that the number of crossing paths N_cp observed
in the empirical landscapes appears to be systematically smaller than predicted by the
RMF model, a deviation which coincides with the relatively low rank correlation of this
measure compared to the other measures (fig. 4).

4. Discussion and outlook 

In this review, we have first established a set of standardized measures to determine the
ruggedness of fitness landscapes. We then used these measures to compare the ruggedness
of ten available empirical landscapes, and to compare the empirical landscapes to
predictions generated by the Rough Mount Fuji (RMF) model, a model with tunable
ruggedness. Our rank analysis shows that the selected measures correlate very well, and
thus appear to capture the same underlying feature of the landscape. In a sense, they all
capture the amount of epistasis in a particular landscape. This is quite remarkable, since
not all measures are sensitive to all types of epistasis. The Fourier analysis
and the r/s ratio are sensitive to magnitude epistasis, sign epistasis and reciprocal sign
epistasis. The local epistasis measure, f_r + f_s, and measures of accessible pathways
are insensitive to magnitude epistasis, while the number of local fitness maxima is
sensitive only to reciprocal sign epistasis. The fact that the rank correlations
between the different measures are still high could either mean that the effects of sign
epistasis dominate the measures that are sensitive to magnitude epistasis or that the
three types of epistasis co-occur. The above also implies that the measure of epistasis
among two loci (f_r + f_s) contains similar information as the more global measures of
epistasis (involving 4 loci). Sampling local interactions can be done by detection of
pairwise interactions between mutations, which is experimentally more straightforward
than building multidimensional landscapes of connected genotypes. On the other hand,
the Fourier analysis shows that higher order interactions between mutations (F_3, F_4,
etc.) play a significant role, especially in the more rugged landscapes. This information
can only be obtained by constructing such landscapes.

Before we discuss which characteristics of mutations are linked with either
smooth or rugged landscapes, we need to emphasize that the small number of
available landscapes only allows for preliminary conclusions and that the comparison 
is complicated by differences in the methodologies involved to measure fitness. 
Nevertheless, general patterns do emerge as well as gaps in our knowledge. All 
empirical landscapes are relatively small in size and a specific set of mutations is used to 
construct the genotypes. These mutations fall into different classes. Unfortunately, few 
representatives are available per class, and worse: we lack any data for other classes. 
For example, no landscapes are available using mutations known to be beneficial by 
themselves in a particular wildtype, nor do we know of any studies that constructed 
landscapes from deleterious mutations in a single gene. This necessarily limits the 
interpretation and generality of our findings. 

A first characteristic of a mutation which affects ruggedness is whether the mutation
is deleterious [33, 58] or beneficial [26, 25, 55, 57, 56]. Among the available landscapes,
those that are constructed using beneficial mutations (A-C and E-F in table 2) are
smoother than those using deleterious mutations (G-H). Although beneficial mutations
are much rarer than deleterious ones [53, 78], most studies focus on beneficial mutations.
This seems justified given that beneficial mutations do account for a large fraction of the
mutations that contribute to long-term evolution [79, 80]. Both types of mutations are
intrinsically linked, since each fixed beneficial mutation becomes a potential deleterious
mutation when the direction of selection reverses. In that global sense, it does not
matter which type we are dealing with. However, it does matter in which direction
one proceeds, as deleterious and beneficial mutations sample different parts of the
local fitness landscape. Similarly, the position of the wild type is of influence. For
example, the beneficial mutations in the TEM-1 β-lactamase fitness landscape E increase
resistance to the antibiotic cefotaxime [56]. The TEM-1 wild type confers a very low
resistance towards this antibiotic, and the fittest genotype has approximately a 100,000-fold
higher resistance. This clearly differs from the study by Chou et al. [25] (landscape
A) in which a new metabolic pathway is introduced into a strain of Methylobacterium
extorquens and the fittest genotype displays a 94% (i.e. ~2-fold) fitness increase. Note
also that the empirical fitness landscapes include only mutations that are known to
alter fitness (individually or collectively); mutations with an individual neutral
effect are excluded. Still, neutral mutations make up a significant portion of all available
mutations [81], and are known to contribute to long-term adaptation [8].

A second characteristic that appears to influence the degree of ruggedness is
whether the individual or the combined effect (beneficial or deleterious) is known. For
example, the mutations that were studied by Chou et al. [25] and Khan et al. [55]
(landscape B) collectively produced a well-adapted genotype after many generations
of evolution, whereas the mutations studied by [33] (landscape H) were a priori only
known to have a deleterious effect in the wild-type background. In the first category, all
intermediates are constructed between two points in genotype space that are connected
by at least one accessible pathway (otherwise the higher-fitness genotype would not have
been found). In the second category, the genotype that combines all mutations is not
necessarily accessible from the wild type, and does not even have to be better adapted.
Consistent with an expected greater bias against (sign) epistasis in the first category, the
landscapes based on genome-wide collectively beneficial mutations show less epistasis
than those based on individually deleterious mutations (see table 2). A more direct test
would be a comparison between mutations with known collective or individual effect of
the same fitness sign (all beneficial or all deleterious) within the same biological system,
but we presently lack such data.

A third distinction that affects ruggedness of empirical fitness landscapes is the level
at which the mutations interact. The included mutations can for example affect fitness
[33, 55], can operate in a common genetic pathway [25], or can even be located in the
same gene [26, 56, 57]. The landscapes constructed from beneficial mutations located in
different genes (A-B) are smoother than those from beneficial mutations located in the
same gene (C and E-F). When epistasis is detected between mutations in different genes,
this information has traditionally been used to infer their combined contribution to a
metabolic pathway. The reverse is also true: when empirical fitness landscapes combine
mutations that operate in a common genetic pathway, finding epistasis becomes more
likely. This becomes even more prominent when mutations are located in a single gene.
Epistasis among mutations in different genes can result from functional constraints
caused by interactions in a metabolic network [82, 83], whereas intragenic epistasis can
also result from structural constraints when nucleotide positions in a single gene have
a combined effect on protein shape, enzyme activity, or folding stability [84, 85]. This
relates to the type of epistasis that one expects to find. Magnitude epistasis is often
associated with mutations in different genes in a metabolic network [53, 86, 87], whereas
sign epistasis is expected to occur more often between positions in a single gene [88].
The expectation of a greater contribution of epistasis to landscapes based on mutations
in a single gene versus in different genes is supported by our analysis (see table 2). Note,
however, that compensatory mutations are often located in different genes [52, 89], and
that sign epistasis between deleterious mutations has also been shown to occur at a
genome-wide scale [33].

Having in mind all complicating differences between the various fitness proxies and 
the diverse set of biological organisms used for testing, it is all the more surprising how 
well the simple model ([2]) seems to capture features of the real landscapes. However, we 
emphasize that we studied averaged quantities of small (i.e. 4-locus) subgraphs of the 
landscapes in order to standardize our measures and compare them to model predictions. 
Hence information contained in the full landscape might have been overlooked in these
analyses. For example, the considerations in the previous paragraph suggest that the 
level of interaction between mutants should be distributed very inhomogeneously on 
large enough landscapes, consistent with predictions from metabolic models [ll|83]. This 
means that the landscapes can be decomposed into subgraphs, some of which contain 
much, others little or no epistasis. Such a decomposition would reflect the strength of 
interactions between specific combinations of mutations. For instance, one would expect 
that mutations changing the same functional part of one protein should highly influence 
the impact of one another on the function. On the other hand, the impact of mutations 
altering different proteins, which do not interact, should be independent of each other. 
Searching for such patterns by looking at distributions of epistasis measures instead of 
their mean, is a promising direction for future study. 

If the existence of genetic modules with different levels of epistasis can be established 
empirically, it will be a challenge for future models of fitness landscapes to take 
this realism into account, and study its evolutionary consequences. In fact, the 
LK model [27, 28] was introduced with the idea of incorporating such structures. 
However, these models make very specific assumptions about the distribution of the 
size and coupling of different epistatic modules, and these should be compared to
measures based on empirical landscapes. Where systematic deviations are observed, the 
empirical information may then be used to adapt the models. Much of the progress in 
understanding fitness landscapes and their evolutionary implications, therefore, depends 
on the availability of additional empirical landscapes from various systems - particularly 
from those classes where we lack any information. 